    Bethe Ansaetze for GKP strings

    Studying the scattering of excitations around a dynamical background has a long history in the context of integrable models. The Gubser-Klebanov-Polyakov string solution provides such a background for the string/gauge correspondence. Taking the conjectured all-loop asymptotic equations for the AdS_4/CFT_3 correspondence as the starting point, we derive the S-matrix and a set of spectral equations for the lowest-lying excitations. We find that these equations closely resemble the analogous equations for AdS_5/CFT_4, which are also discussed in this paper. At large values of the coupling constant we show that they reproduce the Bethe equations proposed to describe the spectrum of the low-energy limit of the AdS_4 x CP^3 sigma model. Comment: 60 pages, 5 figures
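
    As a point of reference, asymptotic Bethe equations for the momenta p_k of N excitations on a system of length L typically take the schematic form below. This is the generic textbook structure, not the specific GKP equations derived in the paper, and the two-body S-matrix S(p_k, p_j) stands in for whatever scattering data the model supplies.

        e^{i p_k L} \prod_{\substack{j=1 \\ j \neq k}}^{N} S(p_k, p_j) = 1, \qquad k = 1, \dots, N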

    Scaling Deep Learning on GPU and Knights Landing clusters

    The speed of deep neural network training has become a major bottleneck in deep learning research and development. For example, training GoogLeNet on the ImageNet dataset with one Nvidia K20 GPU takes 21 days. To speed up training, current deep learning systems rely heavily on hardware accelerators. However, these accelerators have limited on-chip memory compared with CPUs, so to handle large datasets they must fetch data from either CPU memory or remote processors. We use both self-hosted Intel Knights Landing (KNL) clusters and multi-GPU clusters as our target platforms. From an algorithmic perspective, current distributed machine learning systems are designed mainly for cloud systems; these methods are asynchronous because of the slow network and high fault-tolerance requirements of cloud systems. We focus on Elastic Averaging SGD (EASGD) to design algorithms for HPC clusters. The original EASGD used a round-robin method for communication and updates, ordered by machine rank ID, which is inefficient on HPC clusters. First, we redesign four efficient algorithms for HPC systems to improve EASGD's poor scaling on clusters. Async EASGD, Async MEASGD, and Hogwild EASGD are faster than their existing counterparts (Async SGD, Async MSGD, and Hogwild SGD, respectively) in all our comparisons. Finally, we design Sync EASGD, which ties for the best performance among all the methods while being deterministic. In addition to the algorithmic improvements, we use system-algorithm codesign techniques to scale up the algorithms. By reducing the percentage of communication from 87% to 14%, our Sync EASGD achieves a 5.3x speedup over the original EASGD on the same platform. We achieve 91.5% weak scaling efficiency on 4253 KNL cores, which is higher than the state-of-the-art implementation.
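
    For orientation, the original EASGD update rule couples each worker's parameters to a shared center variable through an elastic force; synchronizing that coupling is what makes Sync EASGD deterministic. Below is a minimal NumPy sketch of one synchronous round under the simplifying assumption of a single shared address space; the function name and shapes are illustrative, not the paper's HPC implementation.

        import numpy as np

        def sync_easgd_step(workers, center, grads, lr=0.01, rho=0.1):
            # One synchronous EASGD round: each worker takes an SGD step and is
            # pulled toward the center variable, while the center moves toward
            # the average of the workers. A minimal sketch, not the paper's code.
            alpha = lr * rho  # strength of the elastic coupling
            new_center = center + alpha * sum(w - center for w in workers)
            new_workers = [w - lr * g - alpha * (w - center)
                           for w, g in zip(workers, grads)]
            return new_workers, new_center

    In a real multi-node setting, the sum over workers would presumably be computed with a collective reduction rather than the Python loop above, whereas a round-robin scheme serializes the center update across ranks.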

    On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference

    We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena. We use these representations as features to train a natural language inference (NLI) classifier based on datasets recast from existing semantic annotations. In applying this process to a representative NMT system, we find that its encoder appears best suited to supporting inferences at the syntax-semantics interface, as compared with anaphora resolution, which requires world knowledge. We conclude with a discussion of the merits and potential deficiencies of the existing process, and how it may be improved and extended as a broader framework for evaluating semantic coverage. Comment: To be presented at NAACL 2018 - 11 pages
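
    Concretely, such a probing setup can be sketched as follows: freeze the NMT encoder, pool its hidden states into fixed-size sentence vectors, combine premise and hypothesis vectors with a standard pairwise feature map, and fit a lightweight classifier on the recast NLI labels. Everything below is illustrative: nmt_encode is a hypothetical stand-in for the probed encoder, and the feature map and classifier are common defaults, not necessarily the paper's choices.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def sentence_features(encode, premise, hypothesis):
            # Pool encoder states into a fixed-size NLI feature vector.
            # `encode` maps a sentence to hidden states of shape (tokens, dim).
            u = encode(premise).mean(axis=0)     # mean-pool premise states
            v = encode(hypothesis).mean(axis=0)  # mean-pool hypothesis states
            # Pairwise combination commonly used by sentence-pair probes
            return np.concatenate([u, v, np.abs(u - v), u * v])

        # Hypothetical usage: `pairs` holds (premise, hypothesis, label) triples
        # from a recast dataset; `nmt_encode` is the frozen NMT encoder.
        # X = np.stack([sentence_features(nmt_encode, p, h) for p, h, _ in pairs])
        # y = np.array([label for _, _, label in pairs])
        # clf = LogisticRegression(max_iter=1000).fit(X, y)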

    Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks

    Empirical evidence suggests that heavy-tailed degree distributions occurring in many real networks are well-approximated by power laws with exponents η that may take values either less than or greater than two. Models based on various forms of exchangeability are able to capture power laws with η < 2, and admit tractable inference algorithms; we draw on previous results to show that η > 2 cannot be generated by the forms of exchangeability used in existing random graph models. Preferential attachment models generate power law exponents greater than two, but have been of limited use as statistical models due to the inherent difficulty of performing inference in non-exchangeable models. Motivated by this gap, we design and implement inference algorithms for a recently proposed class of models that generates η of all possible values. We show that although they are not exchangeable, these models have probabilistic structure amenable to inference. Our methods make a large class of previously intractable models useful for statistical inference. Comment: Accepted for publication in the proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI) 2018
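
    To make the contrast concrete, here is a minimal sketch of the classical preferential attachment mechanism that produces exponents above two. It is a generic Barabasi-Albert-style illustration, not the Beta Neutral-to-the-Left model of the paper, which is more general.

        import numpy as np

        def preferential_attachment_degrees(n_nodes, seed=0):
            # Simulate node degrees under simple preferential attachment:
            # each arriving node attaches one edge to an existing node chosen
            # with probability proportional to its degree. Classically this
            # yields a power-law degree distribution with exponent eta = 3 > 2.
            rng = np.random.default_rng(seed)
            degrees = np.zeros(n_nodes, dtype=int)
            degrees[0] = degrees[1] = 1  # start from a single edge
            endpoints = [0, 1]           # multiset of edge endpoints: sampling
                                         # uniformly from it is degree-proportional
            for new in range(2, n_nodes):
                old = endpoints[rng.integers(len(endpoints))]
                degrees[new] = 1
                degrees[old] += 1
                endpoints.extend([old, new])
            return degrees

        # degrees = preferential_attachment_degrees(100_000)
        # A log-log histogram of `degrees` shows the heavy tail with eta near 3.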